Search of Performance Inefficiencies in Message Passing Applications with KappaPI 2 Tool
Abstract
Performance is a crucial issue for parallel and distributed applications. Automatic performance analysis tools are useful in this context because they help developers in some phases of the performance tuning process. KappaPI 2 is an automatic performance analysis tool with open knowledge about typical inefficiencies in message passing applications. It detects and analyzes these inefficiencies and then suggests to the developer how to improve the application's behavior.
Similar resources
SCALASCA Parallel Performance Analyses of SPEC MPI2007 Applications
The SPEC MPI2007 1.0 benchmark suite provides a rich variety of message-passing HPC application kernels to compare the performance of parallel/distributed computer systems. Its 13 applications use a representative cross-section of programming languages (C/C++/Fortran, often combined) and MPI programming patterns (e.g., blocking vs. non-blocking vs. persistent point-to-point communication, with...
Infrastructure for Performance Tuning MPI Applications
An abstract of the thesis of Kathryn Marie Mohror for the Master of Science in Computer Science, presented November 13, 2003. Title: Infrastructure for Performance Tuning MPI Applications. Clusters of workstations are becoming increasingly popular as a low-budget alternative for supercomputing power. In these systems, message-passing is often used to allow the separate nodes to act as a single co...
Performance Tool Support for MPI-2 on Linux
Programmers of message-passing codes for clusters of workstations face a daunting challenge in understanding the performance bottlenecks of their applications. This is largely due to the vast amount of performance data that is collected, and the time and expertise necessary to use traditional parallel performance tools to analyze that
Protocol-Dependent Message-Passing Performance on Linux Clusters
In a Linux cluster, as in any multi-processor system, the inter-processor communication rate is the major limiting factor to its general usefulness. This research is geared toward improving the communication performance by identifying where the inefficiencies lie and trying to understand their cause. The NetPIPE utility is being used to compare the latency and throughput of all current message-...
TorusBFS: A Novel Message-passing Parallel Breadth-First Search Architecture on FPGAs
Graphs are a fundamental data structure used extensively in numerous domains. In graph-based applications, Breadth-First Search (BFS) is a key component which suffers from long latency of memory accesses. In this paper, we present a novel message passing parallel BFS architecture namely TorusBFS on field-programmable gate array (FPGA). By utilizing the on-chip memories to store the visitation s...
Publication date: 2006